Utility Scripts

General-purpose utilities for working with geospatial raster data, including metadata generation, statistical analysis, and size calculations.

gdal2metadata.py

Generates FGDC (Federal Geographic Data Committee) metadata XML from GDAL-supported raster files.

Usage

python gdal2metadata.py [options] in_Geo.tif in_FGDCtemplate.xml output.xml

Parameters

in_Geo.tif
string
required
Input georeferenced raster file
in_FGDCtemplate.xml
string
required
Input FGDC metadata template XML file
output.xml
string
required
Output populated FGDC metadata XML file

Options

-debug
flag
Print detailed image information during processing
-mm
flag
Compute and report min/max values
-stats
flag
Compute and report statistics
-hist
flag
Report histograms

Metadata Fields Populated

Automatically extracts and populates:

Spatial Information

  • Coordinate system - Projection name and parameters
  • Datum and spheroid - Target body, radii, flattening
  • Resolution - Pixel size in degrees or meters
  • Extent - Bounding coordinates (westbc, eastbc, northbc, southbc)
  • Image dimensions - Rows and columns

Projection Parameters

Supports and extracts parameters for:
  • Equirectangular
  • Mercator
  • Transverse Mercator
  • Sinusoidal
  • Robinson
  • Stereographic
  • Polar Stereographic
  • Orthographic

Technical Details

  • Data type and bit depth
  • NoData values
  • Scale and offset

Example

# Basic metadata generation
python gdal2metadata.py input.tif fgdc-template.xml output_metadata.xml

# With statistics and debugging
python gdal2metadata.py -debug -stats input.tif fgdc-template.xml output_metadata.xml

Template File

The script requires an FGDC XML template that already contains placeholder elements. It overwrites their values, for example in the map projection block:
<horizsys>
  <planar>
    <mapproj>
      <mapprojn>Equirectangular</mapprojn>
      <equirect>
        <stdparll>0</stdparll>
        <longcm>0</longcm>
      </equirect>
    </mapproj>
  </planar>
</horizsys>
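As a rough illustration of how placeholder elements like these can be filled in, the sketch below uses the standard-library xml.etree.ElementTree. The helper name fill_equirect and the sample values are hypothetical; they are not taken from gdal2metadata.py, which may populate the template differently.

```python
import xml.etree.ElementTree as ET

# Template fragment copied from the example above.
TEMPLATE = """<horizsys>
  <planar>
    <mapproj>
      <mapprojn>Equirectangular</mapprojn>
      <equirect>
        <stdparll>0</stdparll>
        <longcm>0</longcm>
      </equirect>
    </mapproj>
  </planar>
</horizsys>"""


def fill_equirect(root, std_parallel, central_meridian):
    """Overwrite the equirectangular placeholders (hypothetical helper)."""
    # iter() matches the tag at any depth, so this works whether `root`
    # is <horizsys> itself or a larger document that contains it.
    for tag, value in (("stdparll", std_parallel), ("longcm", central_meridian)):
        for elem in root.iter(tag):
            elem.text = str(value)
    return root


root = fill_equirect(ET.fromstring(TEMPLATE), 0.0, 180.0)
print(ET.tostring(root, encoding="unicode"))
```

Elements that are missing from the template are simply not touched here, which is why the template needs the placeholders in place beforehand.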

Coordinate Normalization

Longitudes are automatically normalized to the -180 to 180 degree range so that the output passes FGDC validation.
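A minimal sketch of this kind of wrap-around is shown below. The function name normalize_longitude is hypothetical, and keeping an eastern bound of exactly +180 (instead of flipping it to -180) is an assumption of this sketch, not confirmed behavior of the script.

```python
def normalize_longitude(lon):
    """Wrap a longitude in degrees into the -180 to 180 range."""
    wrapped = ((lon + 180.0) % 360.0) - 180.0
    # Assumption: a positive input that lands on the seam stays at +180
    # so that an eastern bounding coordinate does not become -180.
    if wrapped == -180.0 and lon > 0:
        return 180.0
    return wrapped


for lon in (0.0, 190.0, 270.0, 360.0, -180.0, 180.0):
    print(lon, "->", normalize_longitude(lon))
```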

Requirements

Dependencies:
  • Python 2.7+ or Python 3.x
  • GDAL Python bindings
  • lxml or xml.etree for XML processing
# Install lxml (recommended)
conda install -c conda-forge lxml

gdal_hist.py

Exports raster histogram data in tab-delimited format for analysis and visualization.

Usage

python gdal_hist.py [options] datasetname

Parameters

datasetname
string
required
Input raster dataset

Options

-mm
flag
Compute and display min/max values
-stats
flag
Compute and display statistics (min, max, mean, stddev, RMS)
-hist
flag
Export histogram data (required)
-unscale
flag
Apply scale and offset to unscale values to original units
At least one flag (-mm, -stats, or -hist) must be specified.

Output Format

Statistics Output

Min=0.00, Max=50.00, Mean=15.23, StdDev=8.45, RMS=17.42
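The sample line above is consistent with the identity RMS² = Mean² + StdDev² (15.23² + 8.45² ≈ 17.42²), which holds when the standard deviation is the population form. The sketch below computes the same five values on a plain list. It assumes population variance and is not taken from gdal_hist.py.

```python
import math


def band_stats(values):
    """Return min, max, mean, population stddev and RMS of a sequence."""
    n = float(len(values))
    mean = sum(values) / n
    # Assumption: population variance (divide by n, not n - 1).
    variance = sum((v - mean) ** 2 for v in values) / n
    rms = math.sqrt(sum(v * v for v in values) / n)
    return {
        "min": min(values),
        "max": max(values),
        "mean": mean,
        "stddev": math.sqrt(variance),
        "rms": rms,
    }


stats = band_stats([0.0, 10.0, 20.0, 30.0])
print("Min={min:.2f}, Max={max:.2f}, Mean={mean:.2f}, "
      "StdDev={stddev:.2f}, RMS={rms:.2f}".format(**stats))
```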

Histogram Output

level    value      count    cumulative
0        0.00       1234     0.012340
1        0.20       2345     0.035790
2        0.40       3456     0.070350
...
Columns:
  • level - Bin number
  • value - Center value of bin
  • count - Number of pixels in bin
  • cumulative - Cumulative fraction of pixels up to and including this bin (0-1)
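To show how the cumulative column relates to the count column, here is a small sketch that reproduces the first three cumulative values from the sample above. The fourth count (92965) is a made-up remainder so that the total is 100000; the function name is hypothetical.

```python
def cumulative_fractions(counts):
    """Return the running fraction of pixels after each bin."""
    total = float(sum(counts))
    running = 0
    fractions = []
    for count in counts:
        running += count
        fractions.append(running / total)
    return fractions


# First three counts come from the sample output; the last is hypothetical.
counts = [1234, 2345, 3456, 92965]
for level, (count, cum) in enumerate(zip(counts, cumulative_fractions(counts))):
    print("%d\t%d\t%.6f" % (level, count, cum))
```

The last bin always reaches 1.0, which is a quick sanity check on exported histograms.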

Example

# Export histogram only
python gdal_hist.py -hist slope.tif > slope_histogram.txt

# Statistics and histogram
python gdal_hist.py -stats -hist dem.tif > dem_analysis.txt

# Unscaled values
python gdal_hist.py -unscale -stats -hist scaled_data.tif > unscaled_stats.txt

Scale and Offset

When -unscale is used:
value = (raw_value × scale) + offset
Reads scale and offset from raster metadata.
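The formula can be sketched as a small helper. In the GDAL Python bindings, a band's GetScale() and GetOffset() return None when no value is stored, so this sketch treats None as scale 1 and offset 0. That fallback is an assumption here, not confirmed behavior of gdal_hist.py.

```python
def unscale(raw_value, scale=None, offset=None):
    """Convert a stored pixel value to original units."""
    # Assumption: a missing scale/offset means "no scaling applied".
    scale = 1.0 if scale is None else scale
    offset = 0.0 if offset is None else offset
    return raw_value * scale + offset


print(unscale(100, 0.5, 10.0))   # scaled and shifted
print(unscale(7))                # no metadata: value is unchanged
```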

Multi-band Files

For multi-band files, each band is processed separately:
Band 1 Block=256x256 Type=Float32, ColorInterp=Gray
Min=0.00, Max=50.00, Mean=15.23, StdDev=8.45, RMS=17.42
level    value      count    cumulative
...

slope_histogram_cumulative_graph.py

Creates histogram and cumulative distribution visualizations from tabular histogram data.

Usage

python slope_histogram_cumulative_graph.py [options] input.tab output.png

Parameters

input.tab
string
required
Input tab-delimited histogram file (from gdal_hist.py)
output.png
string
required
Output PNG image file

Options

-name
string
Title for the plot
-name "Mars Landing Site A"

Input File Format

Expects tab-delimited format from gdal_hist.py:
level    value    count    cumulative
0        0.00     1234     0.012340
1        0.20     2345     0.035790

Output Visualization

Generates a dual-axis plot:
  • Left Y-axis - Frequency histogram (gray filled area)
  • Right Y-axis - Cumulative distribution (blue line)
  • X-axis - Value (typically slope in degrees)
  • Vertical line - Reference line at x=15 (suitable for slopes)
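A plot with this layout can be sketched with matplotlib as follows. The function name, colors, figure size, and dashed line style are assumptions chosen to match the description above, not the script's actual source.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so no display is required
import matplotlib.pyplot as plt


def plot_histogram(values, counts, cumulative, out_png, title=None, ref_x=15.0):
    """Draw a frequency histogram with a cumulative curve on a second axis."""
    fig, ax_hist = plt.subplots(figsize=(8, 5))

    # Left Y-axis: frequency as a gray filled area.
    ax_hist.fill_between(values, counts, color="gray", alpha=0.6)
    ax_hist.set_xlabel("Value")
    ax_hist.set_ylabel("Frequency")

    # Right Y-axis: cumulative distribution as a blue line.
    ax_cum = ax_hist.twinx()
    ax_cum.plot(values, cumulative, color="blue")
    ax_cum.set_ylabel("Cumulative fraction")
    ax_cum.set_ylim(0, 1)

    # Vertical reference line (x=15 by default).
    ax_hist.axvline(ref_x, color="black", linestyle="--")

    if title:
        ax_hist.set_title(title)
    fig.savefig(out_png, dpi=150)
    plt.close(fig)


if __name__ == "__main__":
    plot_histogram([0.0, 0.2, 0.4], [1234, 2345, 3456],
                   [0.01234, 0.03579, 0.07035],
                   "slope_histogram.png", title="Site Alpha")
```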

Example

# Generate histogram from DEM slope
python gdal_hist.py -hist slope.tif > slope_hist.txt

# Create visualization
python slope_histogram_cumulative_graph.py -name "Site Alpha" \
  slope_hist.txt slope_histogram.png

Customization

The script can be modified to:
  • Change reference line position (default: x=15)
  • Adjust color scheme
  • Modify axis labels
  • Change figure size

Requirements

Dependencies:
  • Python 3.x
  • pandas
  • matplotlib
conda install -c conda-forge pandas matplotlib

gdalSize.py

Calculates uncompressed raster file size based on geographic extent and resolution.

Usage

python gdalSize.py minlong minlat maxlong maxlat resolution bitType bands infile

Parameters

minlong
float
required
Minimum longitude (degrees)
minlat
float
required
Minimum latitude (degrees)
maxlong
float
required
Maximum longitude (degrees)
maxlat
float
required
Maximum latitude (degrees)
resolution
float
required
Resolution in meters per pixel
bitType
integer
required
Bit depth: 8, 16, or 32
bands
integer
required
Number of bands
infile
string
required
Reference image for projection information

Output

Reports estimated file size:
1234.5 in Megabytes
1.2 in Gigabytes

How It Works

  1. Reads projection from reference image
  2. Transforms geographic bounds to projected coordinates
  3. Calculates dimensions based on resolution
  4. Computes uncompressed size: lines × samples × bands × bytes_per_pixel

Size Calculation

Bit Depth    Bytes per Pixel
8            1
16           2
32           4
Size = (extent_x / resolution) × (extent_y / resolution) × bands × bytes_per_pixel
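The formula can be sketched as below for an extent that is already in projected meters. The reprojection of the geographic bounds is left out. Rounding the dimensions up to whole pixels and reporting with 1024-based units are assumptions of this sketch, and the function name is hypothetical.

```python
import math

BYTES_PER_PIXEL = {8: 1, 16: 2, 32: 4}


def uncompressed_size_bytes(extent_x_m, extent_y_m, resolution_m, bit_type, bands):
    """Estimate uncompressed raster size in bytes."""
    # Assumption: partial pixels at the edge are rounded up.
    samples = int(math.ceil(extent_x_m / resolution_m))
    lines = int(math.ceil(extent_y_m / resolution_m))
    return lines * samples * bands * BYTES_PER_PIXEL[bit_type]


# 100 km x 50 km at 100 m/pixel, 16-bit, 1 band (illustrative values).
size = uncompressed_size_bytes(100000.0, 50000.0, 100.0, 16, 1)
print("%.1f in Megabytes" % (size / 1024.0 ** 2))
print("%.1f in Gigabytes" % (size / 1024.0 ** 3))
```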

Example

# Calculate size for Mars region
# Bounds: -180 to 180 lon, -90 to 90 lat
# Resolution: 100 m/pixel
# 16-bit, 1 band
python gdalSize.py -180 -90 180 90 100 16 1 mars_reference.tif
Output:
48000.0 in Megabytes
46.9 in Gigabytes

Use Cases

  • Storage planning - Estimate disk space before processing
  • Data ordering - Calculate download sizes
  • Processing planning - Determine memory requirements
  • Cost estimation - Calculate cloud storage costs

Limitations

  • Calculates uncompressed size only
  • Actual compressed size varies by format and compression
  • Does not account for tile/block overhead
For compressed size estimates:
  • GeoTIFF with LZW: ~30-50% of uncompressed
  • JPEG2000: ~10-20% of uncompressed
  • Cloud-optimized formats: add ~5-10% for overviews

gdal2AsciiLatLonBands.py

Exports raster band data to ASCII/CSV format with optional latitude/longitude or XY coordinate columns.

Usage

python gdal2AsciiLatLonBands.py [-srcwin xoff yoff width height] [-band N] [-addheader] [-printLatLon] [-printYX] srcfile [dstfile]

Parameters

srcfile
string
required
Input raster file
dstfile
string
Output ASCII/CSV file (writes to stdout if not specified)

Options

-srcwin xoff yoff width height
integers
Extract a subset window (offsets and dimensions in pixels)
-band N
integer
Band number to export (can be specified multiple times for multiple bands). Defaults to band 1.
-addheader
flag
Add header row with field names
-printLatLon
flag
Include Lat/Lon columns (uses GDAL projection to calculate)
-printYX
flag
Include Y,X columns in meters

Examples

# Export band 1 to stdout
python gdal2AsciiLatLonBands.py input.cub

Output Format

Without coordinates:
Band1
12.3
15.7
18.2
With header and Y,X:
Y, X, Band1
1000.0, 2000.0, 12.3
1000.0, 2010.0, 15.7
1000.0, 2020.0, 18.2
With Lat/Lon and multiple bands:
Lat, Lon, Band1, Band2, Band3
45.123, -122.456, 12.3, 45.6, 78.9
45.124, -122.455, 15.7, 48.2, 81.3
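The Y,X columns correspond to projected coordinates derived from the raster's affine geotransform. The sketch below applies the standard GDAL six-element geotransform to a pixel. Using the pixel center (the +0.5 offsets) is an assumption, and the geotransform values are chosen only to reproduce the Y,X sample rows above.

```python
def pixel_center_to_xy(geotransform, col, row):
    """Map a pixel (col, row) to projected X, Y at the pixel center."""
    x0, dx, rot_x, y0, rot_y, dy = geotransform
    # Assumption: coordinates are reported at the pixel center.
    c, r = col + 0.5, row + 0.5
    x = x0 + c * dx + r * rot_x
    y = y0 + c * rot_y + r * dy
    return x, y


# Hypothetical north-up raster with 10 m pixels.
gt = (1995.0, 10.0, 0.0, 1005.0, 0.0, -10.0)
print("Y, X")
for col in range(3):
    x, y = pixel_center_to_xy(gt, col, 0)
    print("%.1f, %.1f" % (y, x))
```

Latitude and longitude would additionally require transforming these projected coordinates to the raster's geographic coordinate system, which is omitted here.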

Use Cases

  • Point sampling - Extract band values at specific locations
  • Data integration - Create CSV for import to databases or spreadsheets
  • Validation - Compare pixel values across different datasets
  • Analysis input - Generate point clouds for statistical analysis

Common Workflows

Complete Slope Analysis Workflow

# 1. Calculate slope
python gdal_baseline_slope.py -baseline 5 input_dem.tif slope.tif

# 2. Generate statistics
python gdal_hist.py -stats -hist slope.tif > slope_data.txt

# 3. Create visualization
python slope_histogram_cumulative_graph.py -name "Site Analysis" \
  slope_data.txt slope_plot.png

Metadata Documentation Workflow

# 1. Generate metadata
python gdal2metadata.py input.tif fgdc_template.xml metadata.xml

# 2. Validate FGDC metadata
mp -e metadata.xml  # If USGS MP tool is available

# 3. Create HTML view
xsltproc fgdc-html.xsl metadata.xml > metadata.html

Data Planning Workflow

# 1. Calculate storage needs
python gdalSize.py -180 -85 180 85 100 16 3 reference.tif

# 2. Estimate processing time
# Use size estimate to calculate processing duration

# 3. Plan tile scheme
# Based on size, determine optimal tile size

Installation

Basic Installation

# Install GDAL and core dependencies
conda install -c conda-forge gdal numpy

Full Installation (All Utilities)

# Install all dependencies
conda install -c conda-forge gdal numpy scipy pandas matplotlib lxml

Verify Installation

# Test GDAL
python -c "from osgeo import gdal; print(gdal.__version__)"

# Test other libraries
python -c "import numpy, pandas, matplotlib; print('OK')"

Requirements Summary

Script                                 Python   GDAL   NumPy   Pandas   Matplotlib   lxml
gdal2metadata.py                       2.7+     yes    -       -        -            yes
gdal_hist.py                           2.7+     yes    -       -        -            -
slope_histogram_cumulative_graph.py    3.x      -      -       yes      yes          -
gdalSize.py                            2.7+     yes    -       -        -            -
gdal2AsciiLatLonBands.py               2.7+     yes    -       -        -            -

Troubleshooting

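If Python cannot import the GDAL module, install the GDAL Python bindings:
conda install -c conda-forge gdal
If gdal2metadata.py fails or its output does not validate, ensure:
  • Input file has a valid projection
  • Template XML is in valid FGDC format
  • Longitude values are in the -180 to 180 range
If slope_histogram_cumulative_graph.py cannot read or plot its input, check:
  • Input file is tab-delimited
  • Input contains the required columns: level, value, count, cumulative
  • matplotlib backend is properly configured
If gdalSize.py reports an unexpected size, verify:
  • Reference image has the correct projection
  • Bounds are in the correct order (min before max)
  • Resolution units match the projection units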

Author

Developed by Trent Hare and contributors at USGS Astrogeology Science Center.

License

Public domain (Unlicense) unless otherwise specified.